ABSTRACT

LAGLIDADG homing endonucleases (LHEs) are com- 
pact endonucleases with 20-22 bp recognition sites, 
and thus are ideal scaffolds for engineering site- 
specific DNA cleavage enzymes for genome edit- 
ing applications. Here, we describe a general ap- 
proach to LHE engineering that combines ratio- 
nal design with directed evolution, using a yeast 
surface display high-throughput cleavage selection. 
This approach was employed to alter the binding 
and cleavage specificity of the I-AniI LHE to recognize
a mutation in the mouse Bruton tyrosine kinase
(Btk) gene causative for mouse X-linked immunodeficiency
(XID), a model of human X-linked agammaglobulinemia
(XLA). The required re-targeting of
I-AniI involved progressive resculpting of the DNA
contact interface to accommodate nine base differences
from the native cleavage sequence. The enzyme
emerging from the progressive engineering
process was specific for the XID mutant allele versus
the wild-type (WT) allele, and exhibited activity
equivalent to WT I-AniI in vitro and in cellulo reporter
assays. Fusion of the enzyme to a site-specific DNA 
binding domain of transcription activator-like effec- 
tor (TALE) resulted in a further enhancement of gene 
editing efficiency. These results illustrate the poten- 
tial of LHE enzymes as specific and efficient tools for 
therapeutic genome engineering. 



INTRODUCTION 

Homing endonucleases (HEs) are sequence-specific enzymes
that recognize and cleave DNA at long target sites
(typically 20 bp). They are typically encoded within introns
or inteins, and behave as mobile genetic elements that copy
their genetic information into intron- or intein-less alleles 
of their host gene. This genetic mobility is catalyzed by HE 
endonuclease-mediated DNA double-strand breaks (DSBs) 
in intein/intron-less alleles of the host gene. This facilitates 
repair by homologous recombination using the intron- or 
intein-containing gene, resulting in copying of the intron or 
intein into the new allele site (1,2). 

LAGLIDADG homing endonucleases (LHEs) are a par- 
ticularly attractive system for the development of gene- 
specific reagents because they possess 20-22 bp recognition 
sites, and cleavage activity is tightly coupled to DNA target 
site recognition (1,3,4). A variety of approaches have been 
applied to generate LHE variants with new cleavage speci- 
ficities, most of them involving 'local' variant library gener- 
ation through random mutation or structural-based modifi- 
cation of the LHE protein interface that contacts the DNA 
target site, followed by selection based on DNA cleavage or 
recombination activities (5-8). These methods currently are 
able to generate variants with changes in cleavage specificity 
in a 'local' region of the LHE DNA/protein interface cover- 
ing a relatively small number of contiguous base pairs. For 
physiologic targets, where multiple base pair mismatches 
must be targeted by variants that possess alterations in adja- 
cent or overlapping regions of the DNA/protein interface, 
the engineering of LHE variants with high specificity and 
activity requires combinations of local changes that often 
include conflicting sets of amino acid (AA) changes in the 
interface. The development of methods to overcome limitations
in large scale LHE re-specification is necessary to expand
the application of LHEs to such extremely challenging
targets.

Using a yeast surface display high-throughput cleavage 
selection system (9), here we show the application of rational
design with directed evolution in a progressive approach
to achieve a large scale re-engineering of the I-AniI
homing endonuclease to specifically recognize a unique sequence
in the mouse Bruton tyrosine kinase (Btk) gene differing
by 9 bp from the native I-AniI sequence. Fusion of
this enzyme to a sequence-specific TALE DNA binding domain
was used to further increase the activity and specificity
of the most refined variant for the XID target site. Taken 
together, our results provide a roadmap for engineering of 
LHEs to create highly specific and active reagents for ther- 
apeutic genome engineering. 

MATERIALS AND METHODS 

DNA constructs and substrates for binding and cleavage assays

The I-AniI scaffold used here is the Y2 variant, containing
two additional mutations, F13Y and S111Y, which enhance
both DNA-binding affinity and cleavage efficacy (10). The
TALE repeat variable diresidue (RVD) arrays were assembled
using the Golden Gate TAL effector kit, and the TAL effectors
were fused to the N-terminus of XID through a Zn4
linker (VGGS) (11,12). The 52 bp HE substrates were generated
by PCR using single-strand oligonucleotides as template,
with HE recognition sites in the middle flanked by a 16
bp primer binding site on each end. Biotin and fluorophore
labels were introduced by a 5' biotin-conjugated primer and a 3'
Alexa Fluor-647-conjugated primer, respectively. HE substrates
were purified from single-stranded contaminants by
ExoI digestion (New England Biolabs) and size exclusion
through a G-100 column (GE Healthcare), then analyzed
for purity by gel electrophoresis (determined to be >98%)
(9).

Yeast growth, transformation, library construction and plasmid recovery

Saccharomyces cerevisiae strain EBY100 was transformed
using the lithium-acetate (LiAc) method (13). For randomization
library construction, randomization oligos with degenerate
codes at selected bases were ordered from Sigma.
After PCR amplification, oligos were cloned into the pETCON2
vector through homologous recombination in yeast.
The distribution of codon frequencies was verified by sequencing
an unselected library and determined to exhibit
no major bias in type at the positions of randomization
(9). For random mutagenesis library construction, error-prone
PCR was performed over a selected region of the I-AniI
variant using the GeneMorph-II Random Mutagenesis
kit (Stratagene) according to the manufacturer's protocol.
Library size was determined by plating serial dilutions
on selective plates. Mutation distribution and frequencies
were verified by sequencing an unselected library and determined
to be in the range of 7-10 mutations per kilobase
with no major biases in the type or position of mutations.
Yeast propagation was performed in the presence of 2% raffinose
+ 0.1% glucose at 30°C for at least 12 h prior to induction.
Cells were induced in 2% galactose for 2-3 h at
30°C followed by 18-26 h at 20°C. Plasmids were isolated
from yeast populations using the Zymoprep-II kit (Zymo
Research) and transformed into Escherichia coli DH5α by
heat shock for amplification and/or sequencing. Sequencing
was performed on 40-60 clones for a given selection output.

Yeast surface cleavage and sorting 

The yeast surface-based cleavage assay has been described
previously (9). In brief, ~30-100 x 10^ (at least 3-fold the
size of the input population) induced cells were stained first
with a 1:300 dilution of biotinylated anti-HA (Covance), then
with pre-conjugated streptavidin-PE:biotin-DNA-A647 in
a yeast binding buffer containing 180 mM KCl, 10 mM
NaCl, 10 mM HEPES, 0.2% bovine serum albumin (BSA),
0.1% galactose, pH 7.5. Samples were then washed twice in
the cleavage buffer containing 150 mM KCl, 10 mM NaCl,
10 mM HEPES, 10 mM K-glutamine, 0.5 mg/ml BSA, pH
8.25, and then transferred to cleavage buffer containing 7.5
mM of either CaCl2 or MgCl2, and placed at 37°C for the
indicated time points. The reaction was stopped by transferring
cells to three reaction volumes of cold staining buffer with
a 1:200 dilution of FITC-conjugated anti-Myc antibody (ICL
Labs) for HE surface expression staining. The catalytic activity
of HEs was measured by the decrease in Alexa-647 fluorescence
intensity associated with dsOligo cleavage and release
from the yeast surface on a BD ARIAII cytometer, and the resulting
data were analyzed using FlowJo software (Tree Star).
For XID-Ani libraries, the ~0.3-1% of the population with the
highest cleavage activity was sorted for enrichment, and each
library was enriched three times before final analysis.

In vitro cleavage assay and cleavage specificity 

3x10^ displaying yeasts with an estimated concentration of
1-10 nM in a 40 µl reaction (assuming 10^-10^ molecules
per yeast cell surface) (9,14) were incubated with cleavage
buffer + 20 nM Alexa-647-conjugated dsOligo with 5 mM
MgCl2, 5 mM dithiothreitol (DTT) and placed at 37°C for
1 h. The reaction was stopped by adding 50 mM ethylenediaminetetraacetic
acid (EDTA) and DNA sample buffer.
After spinning down, 20 µl of supernatant was loaded on a
10% non-denaturing polyacrylamide gel. HE cleavage sites
are in the middle of the oligo substrates, so cleavage generates
two products of the same size. The cleavage
product detected in the in vitro cleavage assay is the 3' half
carrying the Alexa Fluor-647 label. Quantification was performed with
an Odyssey infrared imaging system (LI-COR Biosciences),
and cleavage activity was calculated as the ratio of cleaved
substrate to total substrate. The specificity profiles of XID-Ani
and WT-Ani were generated by determining the in vitro
cleavage by each enzyme of all 60 possible target sequences
carrying one of the three other bases at a single position. The
percentage of cleavage was normalized to the cleavage of the native
target sequence.
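
The one-off panel described above can be enumerated mechanically: 20 positions times three alternative bases gives the 60 substrates. The following is a minimal sketch only. The function names and the band intensities in the usage note are hypothetical; the XID target sequence and the -10 to +10 numbering (with no position 0) are taken from the text.

```python
# Illustrative sketch; helper names are assumptions, not the authors' code.
BASES = "ACGT"
XID_SITE = "AGTGCCTGTTTCTCTTGACT"  # XID target site as given in the text


def position_labels(n=20):
    """Label positions -10..-1 then +1..+10 (there is no position 0)."""
    half = n // 2
    return [f"-{half - i}" for i in range(half)] + [f"+{i + 1}" for i in range(half)]


def one_off_sites(target):
    """Return {label: sequence} for every single-base substitution of target."""
    labels = position_labels(len(target))
    sites = {}
    for i, native in enumerate(target):
        for base in BASES:
            if base != native:
                sites[f"{labels[i]}{base}"] = target[:i] + base + target[i + 1:]
    return sites


def fraction_cleaved(cleaved, uncut):
    """Cleavage activity as cleaved / (cleaved + uncut) band intensity."""
    return cleaved / (cleaved + uncut)


def relative_cleavage(variant_fraction, native_fraction):
    """Percent cleavage normalized to cleavage of the native target."""
    return 100.0 * variant_fraction / native_fraction


sites = one_off_sites(XID_SITE)
print(len(sites))    # 60
print(sites["-8A"])  # AGAGCCTGTTTCTCTTGACT
```

With made-up band intensities, `relative_cleavage(fraction_cleaved(30, 70), fraction_cleaved(60, 40))` gives a one-off substrate cleaved at about half the efficiency of the native target.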

Yeast surface-based binding assay 

The yeast surface-based binding assay and affinity calculation
have been described previously (9). Briefly, 1x10^ displaying
yeasts were incubated in 50 µl staining buffer containing
biotin-labeled dsOligo ranging from 1 to 500 nM
at 4°C for 2 h. After washing twice with excess staining
buffer, yeasts were co-stained with streptavidin-PE (BD Biosciences)
and FITC-conjugated anti-Myc antibody for another
30 min. The binding affinity of HEs was measured
by the median PE fluorescence value of a selected population
(around 10% of cells) gated for equal Myc epitope surface
expression on a BD LSRII cytometer, and the resulting data
were analyzed using FlowJo software. The median PE values
were plotted versus dsOligo concentrations and fitted using the
Levenberg-Marquardt (LM) algorithm in the VisualEnzymics
(SoftZymics) module for IGOR Pro 6 (WaveMetrics).
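
To make the fitting step concrete, the sketch below fits a titration curve with a small damped least-squares loop in the Levenberg-Marquardt style. It is a stand-in for illustration, not the VisualEnzymics routine used here: the one-site binding model, the function names, the starting guess and the synthetic data are all assumptions.

```python
# Sketch only: a tiny Levenberg-Marquardt style fit in pure Python.
# The one-site model y = Bmax * x / (Kd + x) is an assumed model.

def model(x, bmax, kd):
    return bmax * x / (kd + x)


def sse(xs, ys, bmax, kd):
    """Sum of squared residuals for the current parameters."""
    return sum((y - model(x, bmax, kd)) ** 2 for x, y in zip(xs, ys))


def fit_one_site(xs, ys, iters=200):
    # Crude starting guess: top signal and the middle concentration.
    bmax, kd = float(max(ys)), float(sorted(xs)[len(xs) // 2])
    lam = 1e-3
    err = sse(xs, ys, bmax, kd)
    for _ in range(iters):
        # Accumulate J^T J and J^T r for the two parameters.
        a11 = a12 = a22 = g1 = g2 = 0.0
        for x, y in zip(xs, ys):
            r = y - model(x, bmax, kd)
            j1 = x / (kd + x)               # d(model)/d(bmax)
            j2 = -bmax * x / (kd + x) ** 2  # d(model)/d(kd)
            a11 += j1 * j1
            a12 += j1 * j2
            a22 += j2 * j2
            g1 += j1 * r
            g2 += j2 * r
        # Damped normal equations (diagonal scaled by 1 + lambda).
        d11, d22 = a11 * (1 + lam), a22 * (1 + lam)
        det = d11 * d22 - a12 * a12
        if det == 0:
            break
        step_b = (d22 * g1 - a12 * g2) / det
        step_k = (d11 * g2 - a12 * g1) / det
        new_b, new_k = bmax + step_b, kd + step_k
        new_err = sse(xs, ys, new_b, new_k) if new_k > 0 else float("inf")
        if new_err < err:   # accept the step and relax the damping
            bmax, kd, err, lam = new_b, new_k, new_err, lam / 10
        else:               # reject the step and increase the damping
            lam *= 10
    return bmax, kd


# Synthetic, noise-free titration over the 1-500 nM range used in the assay.
concs = [1, 5, 10, 50, 100, 250, 500]
signal = [model(c, 1000.0, 80.0) for c in concs]
fit_bmax, fit_kd = fit_one_site(concs, signal)
print(fit_bmax, fit_kd)
```

On real median PE values the recovered Kd depends on noise and on how well the titration range brackets the true affinity, so the fit should be inspected rather than trusted blindly.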

In cellulo assay in HEK293T cells and primary MEFs 

I-AniI (tgaggaggtttctctgtaaa), XID (agtgcctgtttctcttgact),
Ani-XID (agtgcctgtttctcttgactctgaggaggtttctctgtaaa) and
TALE-XID (tcacctttaaacttcaagaagtgcctgtttctcttgact) target
sites were inserted into the Traffic Light Reporter (TLR)
vector using standard molecular biology techniques, and the
corresponding lentivirus was produced as described previously
(15,16). Reporter HEK293T cells were generated by
transducing cells with serially diluted reporter lentiviral vectors
(LVs) to obtain a population of cells with single copy
chromosomal integration, and selected by adding 1 µg/ml
puromycin to the culture medium 48 h after transduction.
The reporter cells were sorted against mCherry fluorescence
to remove background resulting from integration errors.
Open reading frames for the WT-Ani, XID-Ani and TALE-XID
enzymes were amplified by PCR and ligated into the
CVL lentiviral backbone, which also co-expresses blue fluorescent
protein (BFP) via a T2A peptide linker. The open reading
frame for the 3' repair exonuclease 2 (TREX2) was amplified
by PCR and ligated into either the CVL lentiviral or the
pEndo backbone, which also co-expresses iRFP via a T2A peptide
linker. For the TLR assay, 1.5 x 10^ human embryonic
kidney (HEK) reporter cells were seeded in a 12-well plate,
and transiently co-transfected with 0.8 µg WT-Ani/XID-Ani/TALE-XID
and TREX2 expression constructs using
X-tremeGENE 9 DNA transfection reagent according to the
manufacturer's protocol (Roche Applied Science). Seventy-two
hours after transfection, the cleavage activity of the enzymes
was measured by the percentage of mCherry positive cells,
which represents double-strand break-induced mutagenic
non-homologous end-joining (NHEJ) events, within the BFP
marked nuclease-expressing population. Genomic DNA
was isolated from BFP marked cells, and the precise cleavage
rate of the integrated target site was determined by sequencing
after PCR amplification and cloning (CloneJET
PCR Cloning Kit, Thermo Scientific). For the homology-directed
repair (HDR) assay, the 11RVD-TALE-XID nuclease
was amplified by PCR and ligated into the CVL
lentiviral backbone with the dl4GFP donor template. Seventy-two
hours after transfection, HDR and NHEJ events were
measured by the percentage of GFP and mCherry positive
cells within the BFP positive population, respectively, and the
precise gene modification rates within this population were
determined by genomic sequencing. Mouse embryonic fibroblasts
(MEFs) were isolated from homozygous XID embryos
at 12-14 days of gestation. MEFs were cultured in
Dulbecco's Modified Eagle's medium supplemented with 2
mM glutamine, 10 mM HEPES and 10% fetal bovine serum
(FBS). For XID MEF experiments, 1.2 x 10^ cells were
plated in 6 cm dishes. The following day, cells were transduced
with LVs expressing 6RVD-TALE-XID and TREX2
in the presence of 4 µg/ml polybrene. Seventy-two hours
post-transduction, BFP and iRFP double positive cells were
sorted and re-seeded in a 6 cm dish. Ten days after transduction,
cells were harvested for genomic DNA. XID and its
homologous sites were amplified from the harvested DNA
and disruption rates were determined by genomic sequencing.
All PCR primers used for genomic amplification are
listed in Supplementary Table S3.

RESULTS 

Human XLA is a rare X-linked genetic disorder caused 
by mutations in the human BTK gene, which is expressed 
at all stages of B-lineage development and is required for 
pre-B cell expansion and mature B cell survival and activation
(17,18). XLA patients lack mature B cells and
immunoglobulin, and experience recurrent bacterial infec- 
tions. Current life-long antibody replacement therapy is 
only partially effective, is expensive, and is associated with 
several long-term complications. While gene addition ther- 
apy with recombinant gammaretroviral and lentiviral vec- 
tors has shown promise (19,20), these approaches have the 
potential to cause insertional mutagenesis and gene expres- 
sion mis-regulation. An ideal method for therapy of XLA 
would be to directly repair the BTK mutation in hematopoietic
stem cells by double-strand break-induced homologous
recombination (3). However, to achieve this efficiently
requires the identification of a Btk-specific nuclease reagent
with sufficient cleavage specificity for therapeutic use. 

Target selection and cluster-based engineering of an XID-specific variant of I-AniI

To explore the potential of using LHEs as gene modification
tools for therapy of XLA, we selected a single base pair mutation
within Exon 2 of the murine Btk gene as the target site
(this mutation is found in the XID mouse, a murine model
of human XLA) (19,20). We endeavored to re-program the
specificity of the LHE, I-AniI, to uniquely target the mutant
allele. This was a significant undertaking, as there are
9 bp differences between the wild-type (WT) I-AniI target sequence
and the XID sequence (Figure 1A), including multiple
residues known to be extremely important for I-AniI activity.
Furthermore, targeting this site required engineering of the
AAs that comprise the protein-DNA interface contacting
both the -7 to -5 and the +5 to +7 positions of the target
site, residues that previous combinatorial strategies elected
to bypass due to the high number of base pair contacts at these
positions (21).
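
The nine differences can be read directly off the two 20 bp sequences. The short sketch below does that comparison; the helper names are illustrative, while both sequences and the -10 to +10 numbering come from Figure 1A.

```python
# Illustrative sketch; helper names are assumptions.
NATIVE_SITE = "TGAGGAGGTTTCTCTGTAAA"  # native I-AniI target (Figure 1A)
XID_SITE = "AGTGCCTGTTTCTCTTGACT"     # mouse XID target (Figure 1A)


def position_labels(n=20):
    """Label positions -10..-1 then +1..+10 (there is no position 0)."""
    half = n // 2
    return [f"-{half - i}" for i in range(half)] + [f"+{i + 1}" for i in range(half)]


def mismatches(native, target):
    """List mismatches as position plus the base required by the new target."""
    labels = position_labels(len(native))
    return [f"{labels[i]}{t}" for i, (n, t) in enumerate(zip(native, target)) if n != t]


diffs = mismatches(NATIVE_SITE, XID_SITE)
print(len(diffs))  # 9
print(diffs)       # ['-10A', '-8T', '-6C', '-5C', '-4T', '+6T', '+7G', '+9C', '+10T']
```

The grouping of these nine positions into four clusters is not derived by this comparison; in the text it is based on enzyme structure as well as mismatch position.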

Recently, a number of I-AniI variants have been isolated
that cleave target sites with single base pair mismatches
from the original I-AniI target (22). As the direct combination
of re-designed variants targeted to single base mismatches
did not generate active enzymes (-6C-5C-4T and
+6T+7G) (data not shown), we selected a 'cluster'-based
(A) I-AniI native site: TGAGGAGGTTTCTCTGTAAA
    Mouse XID site:     AGTGCCTGTTTCTCTTGACT
    Positions: -10 to -1 and +1 to +10, left to right
    Mismatch clusters: -10A-8T, -6C-5C-4T, +6T+7G, +9C+10T
(C) Workflow panel labels: Rational design; Randomization library; Cleavage selection;
    Variant characterization; Design combination; Random mutagenesis library;
    Cleavage selection; Analysis

Figure 1. Target selection and cluster-based engineering of an XID-specific variant of I-AniI. (A) Alignment of the native I-AniI LHE recognition sequence
with the mouse XID site. Mismatches (red) were divided into four clusters. The single base XID mutation at the -8 position (C→T) is shaded in gray. (B) DNA
interface AA residues (red) targeted to DNA sequence mismatches (yellow) were selected for randomization. (C) The workflow for engineering I-AniI
toward the XID target site. Based on rational design, AA residues targeted to mismatch clusters were selected for randomization to generate yeast libraries.
After cleavage selection, random mutations were introduced into selected active variants by error-prone PCR to generate a random mutagenesis library for
further cleavage selection. Finally, selected designs were characterized and combined to generate the XID full site enzyme.
engineering strategy by dividing the XID sequence into four
'clusters' of mismatched residues (-10A-8T, -6C-5C-4T,
+6T+7G and +9C+10T) based on enzyme structure and
Ani/XID target sequence mismatch positions (Figure 1A
and B). Based on structural information, AA residues interacting
with these clusters were selected for randomization.
I-AniI variants bearing alterations in these residues
were incorporated into a yeast surface expression vector,
thereby taking advantage of the high homologous recombination
efficiency in yeast to generate an LHE yeast surface
display library. Using a previously reported yeast surface
cleavage assay (9), the yeast library was subjected to three
rounds of flow cytometry-based selection to enrich highly
active variants. To further increase enzyme activity, random
mutations were introduced by error-prone PCR into the open
reading frames (ORFs) of variants emerging from primary
screens targeting each cluster. After three rounds of cleavage
selection from the random mutation library, enzyme-expressing
DNA vectors were isolated from the final enriched
populations, and individual clones were sequenced
and characterized in vitro (Figure 1C; with details of the engineering
experiments performed for each cluster provided in
Supplementary Figures S1-S4). As shown for the +6T+7G and
-6C-5C-4T cluster libraries (Supplementary Figure S2D,
S3F, and Supplementary Table S1), variants with the highest
cleavage activity were highly enriched in the final selected
population (up to 80% for a single variant), and these clones
typically exhibited similar patterns of AA variation at the
diversified positions.

Combination of cluster designs into an enzyme able to cleave 
the XID target site 

Using the output of the cluster selections, we generated libraries
in which cluster designs were combined to create enzymes
with active plus (+) and negative (-) half sites. However,
we found that directly combining active (+) and (-)
half-site designs from these libraries did not generate active
full site variants, suggesting conformational conflict between
(+) and (-) half-site designs (data not shown). Therefore,
we evaluated partial half-site combinations, and were
able to isolate a -6C-5C-4T+6T+7G+9C+10A-Ani variant
with moderate activity on the yeast surface. Using this ORF
as template for a random mutagenesis library (Supplementary
Table S1, Supplementary Figure S4A), we screened
for variants that exhibited increased activity using our flow
cytometry-based cleavage assay. From this screen, we identified
an R243W mutation that was highly enriched in the
most active population (Supplementary Figure S4B and C).
Structural analysis of the R243 residue indicated that it is
positioned a short distance from the native +9 A:T pair, and
that mutation of that position to a C:G pair causes a steric clash
between this residue and the +9 nucleotide pair. When combining
(+) and (-) site designs, we speculate that this steric clash affects
the positioning of the catalytic domain, an effect that can
be compensated by the R243W mutation. This conclusion is
further supported by the ability of the R243W mutation to
rescue all previously inactive XID variants, following which
their cleavage activity could be further increased by random
mutagenesis and selection (Supplementary Table S1, Supplementary
Figure S4D and E).



In vitro characterization of XID-Ani 

The final selected XID-Ani variant (enriched up to 80%
in the final population) includes 31 AA alterations from
WT-Ani, with nearly half of them (14) selected from random
mutagenesis libraries (Supplementary Table S1, Figure
2A). In a yeast surface-based cleavage assay, XID-Ani
showed similar cleavage efficacy to WT-Ani without detectable
activity toward the WT target (Figure 2B). Once
dissociated from the yeast surface, XID-Ani also showed
similar cleavage kinetics to WT-Ani in an in vitro cleavage
assay (Figure 2C). We previously showed that
re-designed enzymes targeted to partial XID sites (-6C,
+6T+7G) have significantly improved specificity compared
with WT-Ani (Supplementary Figures S2D and S3B). To
evaluate the specificity of the engineered XID-Ani enzyme,
we compared the one-off cleavage specificity profile of XID-Ani
with that of WT-Ani, and also measured their binding affinity
(Figure 2D). 'One-off' target site specificity for XID-Ani ranged
from relatively high at 9 bp positions (where the efficiency of
cleavage of each of the other three bases was less than 50% of that
of the XID target base), to partial or complete degeneracy at 11
positions (where at least one other base was cleaved with an
efficiency >50% of the XID target site) (Figure 2F). This is
a significant improvement over WT-Ani, which showed
partial or complete degeneracy at 19 positions (Figure 2E)
(9), and is explained at least in part by the fact that the re-designed
enzyme has a reduced affinity for the XID target
site; thus, any single base pair mismatch is more likely to
compromise binding to a sufficient extent that enzymatic
activity is compromised. The improved specificity of XID- 
Ani using the highly sensitive flow cytometry assay is an im- 
portant achievement, as it emphasizes the capacity of the 
I-Anil scaffold to be engineered so as to achieve a high effi- 
ciency of on-site cleavage while reducing off-target cleavage 
over nearly the entire DNA/protein interface, an important 
consideration for therapeutic applications where specificity 
is paramount. 
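
The 50% rule used above to call a position specific or degenerate can be written down as a small classifier. This is a sketch under stated assumptions: the function name is hypothetical, the three-way split into "specific", "partial" and "complete" is one reading of the text, and the profile values are invented for illustration rather than taken from Figure 2.

```python
# Illustrative sketch; names and example values are assumptions.

def classify_positions(profile, threshold=50.0):
    """Classify each position from its one-off relative cleavage values.

    profile maps a position label to {alternative base: cleavage as a
    percentage of cleavage of the intended target}.
    """
    calls = {}
    for position, alternatives in profile.items():
        tolerated = [b for b, pct in alternatives.items() if pct > threshold]
        if not tolerated:
            calls[position] = "specific"   # no other base exceeds the threshold
        elif len(tolerated) == len(alternatives):
            calls[position] = "complete"   # every other base is tolerated
        else:
            calls[position] = "partial"    # some, but not all, are tolerated
    return calls


# Made-up relative cleavage values (percent of on-target cleavage).
toy_profile = {
    "-6": {"A": 5.0, "G": 12.0, "T": 0.0},
    "-10": {"C": 80.0, "G": 95.0, "T": 70.0},
    "+8": {"C": 10.0, "G": 65.0, "T": 20.0},
}
calls = classify_positions(toy_profile)
print(calls)
```

Counting the positions called "partial" or "complete" across all 20 positions gives the degeneracy figure compared between enzymes in the text.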

In cellulo performance of XID-Ani 

To characterize the activity of the re-designed XID-Ani in a
cell-based model, we used a previously described TLR system
that is able to report the capacity of an enzyme to generate
both mutagenic NHEJ and targeted HDR (16). The
in cellulo cleavage activities of XID-Ani and WT-Ani were
measured by monitoring each enzyme's ability to generate
mCherry positive cells when co-expressed with the 3' exonuclease
TREX2 (TREX2 degrades the 3' overhang of DSBs
generated by HEs, and thus its overexpression leads to increased
rates of end processing, and so an increased probability
that a cleavage event will be converted to a mutagenic
outcome (23)). To facilitate direct in cellulo cleavage
efficacy comparison, a reporter cell line in which both XID 
and Ani target sites were included in a single TLR was uti- 
lized for both enzymes. Reporter cell lines with single tar- 
get sites were used, in parallel, as controls. Both enzymes 
were stably expressed in HEK293T cells with similar levels 
of protein detected by western blot (Figure 3A). Similar to 
WT-Ani, XID-Ani only induced NHEJ events in 293T re- 
porter cells that possessed the Ani-XID target, but not the 
WT-Ani target site alone (Figure 3B, Supplementary Figure 




Figure 2. In vitro characterization of the XID-Ani enzyme. (A) The final selected XID-Ani variant has 31 AA mutations (red) from WT-Ani. (B) XID-Ani
showed similar cleavage efficacy to WT-Ani on yeast toward its target sequence, as indicated by the allophycocyanin (APC) signal shift in the presence of Mg2+.
In contrast, XID-Ani exhibits no activity toward the WT-Ani sequence. (C) XID-Ani demonstrated in vitro cleavage kinetics toward its target site similar
to those of WT-Ani. Top bands show uncut dsOligo candidate HE substrates. Lower bands represent the 3' half of HE-cleaved dsOligo substrates detected on the
basis of the Alexa Fluor-647 label. (D) XID-Ani showed lower binding affinity than WT-Ani toward their respective target sites in a yeast surface-based
binding affinity titration assay. (E and F) One-off in vitro cleavage profiles for WT-Ani and engineered XID-Ani. Upper panels show cleavage activity of WT-Ani
and XID-Ani toward their respective targets and one-off sites in an in vitro cleavage assay. Lower panels show quantification of relative cleavage efficacy
with cleavage toward Ani and XID sites set as 100%, respectively. Quantification and standard error were calculated from three independent experiments.
Figure 3. In cellulo performance of XID-Ani. (A) Equivalent expression levels of WT-Ani and XID-Ani in HEK293T cells. Cells were co-transfected
with vectors expressing WT-Ani or XID-Ani and a vector expressing TREX2 with a cis-linked BFP reporter. Seventy-two hours later, protein expression
was detected by western blot using an anti-HA antibody. β-Actin was used as loading control. (B) When co-expressed with TREX2, WT-Ani and XID-Ani
induced a similar level of NHEJ events in 293T reporter cells containing WT-Ani and XID target sequences. The cleavage activity was measured as
percentage of mCherry positive events within BFP expressing cells (or within the total cell population in mock-treated cells). Histogram showing gating of
non-transfected (mock) or BFP expressing cells is displayed in upper right quadrant of each FACS plot. (C) Quantification of NHEJ activity was calculated
from three independent experiments.
S5). The cleavage efficacy of XID-Ani in cellulo was similar
to WT-Ani, consistent with their observed relative in vitro 
cleavage activities (Figure 3B and C). 

TALE DNA binding domain fusion with XID-Ani significantly increases its cleavage efficacy

To further increase the efficiency and specificity of the XID-Ani
enzyme, we used a recently developed TALE-LHE fusion,
hereafter referred to as the megaTAL architecture (12). We
created megaTAL fusion enzymes with TALE DNA binding
domains targeted to 6 or 11 bp of the native Btk locus
sequences located upstream of the murine XID site in conjunction
with a 7 bp spacer (Figure 4A). The cleavage efficacy
of these TALE-XID endonucleases was compared
with the XID-Ani LHE in a HEK293T TLR cell line with an integrated
TALE-XID sequence (Figure 4A). With both TALE
array fusions, the 'on-site' cleavage efficacy of TALE-XID
was substantially increased as reported by the TLR flow cytometry
readout (Figure 4B and C; with level of protein expression
demonstrated in Figure 4D). As cleavage measured
by TLR cell lines has been found to under-report true mutation
rates in some cases (24), we also assessed cleavage at
the XID target in this line via amplicon-based sequencing.
Analysis of amplicon sequences demonstrated that nearly
complete disruption of the XID site was achieved in cells co-expressing
11RVD-TALE-XID and TREX2 (27/28 readouts,
96.4%). Importantly, the ratio of cleavage efficacy determined
by genomic sequencing between different enzymes
was consistent with the ratio indicated by the mCherry reporter
readout of the TLR, validating the use of the TLR as
a tool for relative comparisons of enzymatic activity (data
not shown).

Because we were not able to compare the relative binding
affinity of 11RVD versus XID-Ani using the yeast surface-based
binding assay (due to the difficulty of expressing the
TALE domain on the yeast surface), we compared the relative
contribution of TALE and HE by combining different
TALE domains and HE variants. Using a 17 RVD TALE
('L538': TCATTACACCTGCAGCT) (25), we were able
to increase the cleavage activity of XID-Ani. We also obtained
similar cleavage rates using another XID variant
with a nearly identical turnover rate (kcat) but a significantly
lower binding affinity and cleavage activity (~20% of XID-Ani
in the TLR assay). Based on these findings, we argue
that the binding affinity of this 17 RVD TALE is sufficient
to bring HEs with different binding affinities to their maximum
activity. With the 11 RVD TALE (which has lower
binding affinity than 'L538'), the cleavage efficacy of this
XID variant was around 70% of XID-Ani (data not shown).
Based on this observation, we estimate that the binding affinity
of 11 RVD contributes 70-80% of the total observed activity,
which is consistent with the data shown in Figure 4C.

We next determined the cleavage efficacy of the 6RVD-TALE-XID
enzyme at the XID mutant Btk genomic locus.
Primary MEFs derived from XID embryos were co-transduced
by LVs expressing the 6RVD-TALE-XID and
TREX2 (multiplicity of infection (MOI) of 10). Of note, the
codons of the highly repetitive RVD array sequences were
diverged to reduce sequence rearrangements occurring during
reverse transcription (26), thereby permitting efficient
LV packaging and expression of this novel nuclease without
evidence of protein degradation (data not shown). Ten days
after transduction, the native XID target disruption rate was
determined by genomic sequencing within cells marked by
both viruses. Although the XID site is thought to be within
the silent Btk locus in primary MEFs, nearly 40% disruption
(21/53 readouts, 39.6%) was detected in XID MEFs,
primarily including small deletions within the central four
bases of the XID-Ani enzyme recognition site (Figure 4E).

Off-target cleavage of TALE-XID enzyme 

One of the most important considerations for therapeutic 
application of rare-cutting endonucleases is maintaining a
rate of off-target cleavage that is as low as possible (27). Al- 
though XID-Ani showed significantly improved specificity 
from WT-Ani, cleavage profiling identified 19 one-off cleav- 
age sites tolerated by XID-Ani in vitro (based on >50% pre- 
dicted cleavage activity at candidate target nucleotides; Fig- 
ure 2F). To examine the specificity of TALE-XID when ex- 
pressed in vivo, we searched for potential cleavage sites iden- 
tified in the mouse genome. We identified 23 potential sites 
including: two sites with a 1 bp mismatch and 21 sites with 
2 bp mismatches. Sequence and chromosomal positions of 
these sites are provided in Supplementary Table S2. The 
one-off cleavage profile predicted that the 1 bp mismatch 
sites (including the WT Btk sequence) would be partially
tolerated. Among the 2 bp mismatches, the cleavage profile 
predicted that seven sites would be tolerated (both single 
mismatches tolerated by XID-Ani) and 14 sites would not 
be tolerated (at least one mismatch is not tolerated). 
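
The search and triage described above can be sketched in a few lines: scan both strands for 20 bp windows within two mismatches of the XID site, then predict a multi-mismatch site to be tolerated only if every one of its mismatches is individually tolerated in the one-off profile. The function names, the toy sequence and the tolerated set below are assumptions for illustration; a real genome-wide search would need an indexed aligner rather than this linear scan.

```python
# Illustrative sketch; names, toy sequence and tolerated set are assumptions.
XID_SITE = "AGTGCCTGTTTCTCTTGACT"  # XID target site as given in the text


def revcomp(seq):
    """Reverse complement of an upper-case DNA string."""
    return seq.translate(str.maketrans("ACGT", "TGCA"))[::-1]


def hamming(a, b):
    return sum(x != y for x, y in zip(a, b))


def position_labels(n=20):
    """Label positions -10..-1 then +1..+10 (there is no position 0)."""
    half = n // 2
    return [f"-{half - i}" for i in range(half)] + [f"+{i + 1}" for i in range(half)]


def find_candidate_sites(genome, target, max_mismatches=2):
    """Return (strand, offset, window, mismatches) for near-matches.

    For the "-" strand the offset is measured on the reverse complement.
    """
    hits = []
    n = len(target)
    for strand, seq in (("+", genome), ("-", revcomp(genome))):
        for i in range(len(seq) - n + 1):
            window = seq[i:i + n]
            d = hamming(window, target)
            if d <= max_mismatches:
                hits.append((strand, i, window, d))
    return hits


def predicted_tolerated(site, target, tolerated_one_offs):
    """True only if every mismatch in site is individually tolerated."""
    labels = position_labels(len(target))
    subs = [f"{labels[i]}{s}" for i, (t, s) in enumerate(zip(target, site)) if t != s]
    return all(sub in tolerated_one_offs for sub in subs)


toy_site = "GGAGCCTGTTTCTCTTGACT"          # carries -10G and -8A
toy_genome = "CCCC" + toy_site + "TTTT"    # made-up flanking sequence
hits = find_candidate_sites(toy_genome, XID_SITE)
print(hits)
print(predicted_tolerated(toy_site, XID_SITE, {"-10G", "-8A"}))
```

This additive rule mirrors the prediction step only; as the sequencing results show, sites predicted to be tolerated are not necessarily cleaved in cells.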

To directly test the rate of cleavage at predicted off- 
target sites in vivo, seven potential sites including candi- 
dates from each of these three categories were sequenced 
from XID MEF genomic DNA isolated from a cell popula- 
tion in which a 40% XID site disruption had been achieved 
by LV TALE-XID expression. As shown in Figure 5A, 
none of these sites possessed upstream homologous TALE 
binding sequences. No disruption was found at sites that 
were predicted in vitro to be cleavage resistant (-10T+10A, 
-2C+1A, +4T+8G). As for sites for which in vitro cleavage 
was observed, we detected no disruption at the -6A+5C 
and +5C+9T sites, but 7.3% (3/41 readouts) disruption at 
the -10G-8A site (Figure 5B). The tolerance of the -10G-8A 
site is likely due to the positions of its mismatches at nearby residues 
-10 and -8 within the N-terminal loop, where WT I-AniI 
has little specificity, while the -6A+5C and +5C+9T mismatches are 
targeted by distinct domains of XID-Ani and, thus, are 
more likely to be incompatible with cleavage when combined 
together. Finally, we observed low-level, 2.1% cleavage 
activity at the partially tolerated -5T site. Importantly, 
the disruption rate for the -10G-8A and -5T sites was at least 
5-fold lower than that of the XID site, consistent with the 
observed difference in cleavage efficacy of XID-Ani with 
and without the fused TALE DNA binding domain in the 
TLR assay (Figure 4C). Considering the stringency of the 
assay, which enhances detection of nuclease-mediated dis- 
ruption via extended lentiviral vector driven nuclease and 
TREX2 co-expression, these combined results further illus- 
trate the importance of the TALE domain in specifically en- 
hancing 'on site' activity. 
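
The fold differences quoted above can be checked directly from the reported readout counts. The following is a minimal Python sketch and not part of the original analysis: the helper name `disruption_rate` is hypothetical, and the -5T rate is entered as the reported percentage because its readout counts are not given in the text.

```python
def disruption_rate(disrupted: int, total: int) -> float:
    """Return the fraction of sequenced readouts that carry a disruption."""
    if total <= 0:
        raise ValueError("total readouts must be positive")
    return disrupted / total


# Readout counts reported in the text.
on_target = disruption_rate(21, 53)   # XID site in XID MEFs
off_10g8a = disruption_rate(3, 41)    # -10G-8A off-target site

# Only the percentage (2.1%) is reported for the -5T site.
off_5t = 0.021

# Fold reduction of each off-target site relative to the XID site.
fold_10g8a = on_target / off_10g8a
fold_5t = on_target / off_5t

print(f"XID site:      {on_target:.1%}")
print(f"-10G-8A site:  {off_10g8a:.1%} ({fold_10g8a:.1f}-fold lower)")
print(f"-5T site:      {off_5t:.1%} ({fold_5t:.1f}-fold lower)")
```

Both off-target sites come out more than 5-fold below the on-target rate, in line with the statement above.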
Figure 4. Fusion of TALE DNA binding domain with XID-Ani significantly increases its cleavage efficacy. (A) Schematic illustration of TALE-XID 
enzymes (6RVD and 11RVD) and their recognition sequences in the mouse XID native locus (green: TALE binding site; red: XID site). (B) TALE-XID 
enzymes with 6RVDs or 11RVDs significantly increased cleavage efficacy in HEK293T TLR cells containing the native XID locus sequence when co-expressed 
with TREX2. Reporter cells were co-transfected with vectors expressing TALE-XID enzymes and a vector expressing TREX2 with a cis-linked 
BFP reporter, and cleavage activity was measured as the percentage of mCherry positive events within BFP expressing cells. A histogram showing BFP expression 
is displayed in the upper right quadrant of each FACS plot. (C) Quantification of data from three independent experiments. The cleavage activity 
of XID-Ani, based on the percentage of mCherry positive cells, was set as one. (D) XID-Ani, 6RVD-TALE-XID and 11RVD-TALE-XID were stably expressed in HEK293T 
cells without degradation, as detected by western blot analysis using an anti-HA antibody. β-Actin was used as loading control. (E) In the presence of 
TREX2, transduction using an LV expressing codon-diverged 6RVD-TALE-XID disrupts ~40% of endogenous XID sites in primary XID MEFs. The upper 
panel shows a schematic of the experimental approach and the lower panel displays genomic sequencing data from disrupted loci. 
XID      : TCACCTTTAAACTTCAAGAAGTGCCTGTTTCTCTTGACT 
-10G-8A  : TTCATGCACACCCACCCTTGGAGCCTGTTTCTCTTGACT 
-6A+5C   : AGCTATTCTCTTCCAGCACAGTGACTGTTTCTCCTGACT 
+5C+9T   : GAATATGGCTTATTTTATTAGTGCCTGTTTCTCCTGATT 
-5T      : TATATAACTTTGCCTTCATAGTGCTTGTTTCTCTTGACT 
-10T+10A : CTAGCTGCGATGCCAGCCCTGTGCCTGTTTCTCTTGACA 
-2C+1A   : TAGTATTAATTGTCCCAGTAGTGCCTGCTACTCTTGACT 
+4T+8G   : TAGAGGTGATATTATGAATAGTGCCTGTTTCTTTTGGCT 

6RVD-TALE-XID off-target cleavage in MEFs 
Figure 5. Assessment of potential off-target cleavage activity of TALE-XID enzyme. (A) Alignment of seven selected potential XID homologous sites in 
mouse genome. TALE, XID binding sites and mismatch positions are marked in green, red and blue, respectively. (B) Table showing 6RVD-TALE-XID 
off-target cleavage activity in XID MEFs. Potential sites were sequenced following PCR amplification of genomic DNA derived from XID MEF in which 
40% XID site disruption had been achieved by LV TALE-XID expression (as in Figure 4E). 
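
The mismatch labels in Figure 5A follow from a base-by-base comparison of each candidate site with the XID site. The sketch below is illustrative only. It assumes that the 20 bp XID-Ani site corresponds to the last 20 bases of each aligned row, and that positions are numbered -10 to -1 and +1 to +10 with no position 0; this numbering is inferred from the labels rather than stated in the text. The names `XID_SITE`, `position_label` and `mismatch_label` are hypothetical.

```python
# Last 20 bases of the XID row in the Figure 5A alignment (assumed to be the site).
XID_SITE = "AGTGCCTGTTTCTCTTGACT"

# Last 20 bases of each off-target row, keyed by the label used in the figure.
OFF_TARGET_SITES = {
    "-10G-8A": "GGAGCCTGTTTCTCTTGACT",
    "-6A+5C": "AGTGACTGTTTCTCCTGACT",
    "+5C+9T": "AGTGCCTGTTTCTCCTGATT",
    "-5T": "AGTGCTTGTTTCTCTTGACT",
    "-10T+10A": "TGTGCCTGTTTCTCTTGACA",
    "-2C+1A": "AGTGCCTGCTACTCTTGACT",
    "+4T+8G": "AGTGCCTGTTTCTTTTGGCT",
}


def position_label(index: int) -> int:
    """Map a 0-based index (0..19) to site numbering -10..-1, +1..+10."""
    return index - 10 if index < 10 else index - 9


def mismatch_label(site: str, reference: str = XID_SITE) -> str:
    """Build a label such as '-10G-8A' from the mismatches against the reference."""
    if len(site) != len(reference):
        raise ValueError("site and reference must have the same length")
    parts = []
    for index, (ref_base, base) in enumerate(zip(reference, site)):
        if ref_base != base:
            parts.append(f"{position_label(index):+d}{base}")
    return "".join(parts)


for label, site in OFF_TARGET_SITES.items():
    print(label, mismatch_label(site))
```

Under these assumptions, the computed label for each row reproduces the label given in the alignment.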



Catalysis of homologous recombination by the TALE-XID enzyme 

As the therapeutic purpose of the TALE-XID enzyme is 
gene repair, rather than gene disruption, we evaluated the 
use of the TALE-XID enzyme for catalysis of HDR. For 
this purpose, TLR cells were transfected with an expression 
vector containing both the 11RVD-TALE-XID and 
a d14GFP donor template (pRRL SFFV d14GFP EF1s 
11RVD-TALE-XID T2A mTagBFP) in the absence of 
TREX2 (Figure 6A). Three days after transfection, NHEJ 
and HDR rates were read out based upon mCherry versus 
GFP positive cells, respectively. As shown in Figure 6B, a 
clear GFP+ population was detected, indicative 
of HDR. Similar to the NHEJ read out in the TLR assay, 
the percentage of GFP positive cells can underestimate 
the true HDR rate due to promoter silencing (24). Using a 
pair of primers specific for the inserted TLR target site, we 
amplified the reporter target, and sequence analysis revealed 
that 10.3% of the reporter XID target sites were modified by 
HDR and 4.3% by NHEJ (Figure 6C). This high ratio of 
HDR to NHEJ highlights an important advantage of using 
a homing endonuclease-based reagent for gene editing: 
the ratio of HDR to NHEJ is typically substantially higher, 
likely due to differential processing of 3' and 5' breaks (24). 
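
For reference, the HDR to NHEJ ratio implied by the sequencing percentages above can be worked out as follows. This is a small illustrative calculation using only the two reported values; the variable names are hypothetical.

```python
# Percentages of reporter XID target sites modified, as reported in the text.
hdr_percent = 10.3
nhej_percent = 4.3

# Ratio of HDR to NHEJ events among sequenced reporter targets.
hdr_to_nhej = hdr_percent / nhej_percent

# Share of all modified targets that were repaired by HDR.
hdr_share = hdr_percent / (hdr_percent + nhej_percent)

print(f"HDR:NHEJ ratio = {hdr_to_nhej:.1f}")
print(f"HDR share of modified targets = {hdr_share:.0%}")
```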

Figure 6. Catalysis of homologous recombination by the TALE-XID enzyme. (A) Schematic of vector construct expressing GFP donor template, TALE-XID 
enzyme, and cis-linked BFP reporter. (B) In the presence of donor template, 11RVD-TALE-XID induced both HDR and NHEJ in TLR cells containing 
the native XID locus sequence, as shown by the GFP and mCherry positive populations, respectively. Reporter cells were transfected and HDR and 
NHEJ activity was measured within BFP expressing cells. A histogram showing BFP expression is displayed in the upper right quadrant of each FACS plot. (C) 
The percentage of BFP positive cells modified by either HDR or NHEJ in the TLR assay was determined by genomic sequencing. Data shown are the average 
of two independent experiments. 

DISCUSSION 

Here, we present the results of a progressive re-design strategy 
applied to re-specify the cleavage site of an I-AniI 
LAGLIDADG homing endonuclease to a unique mutation 
site in the murine Btk gene. Using yeast surface display 
high-throughput cleavage selection, we were able to overcome 
neighbor effects in the highly integrated LHE interface 
by progressively assembling individual 'half site' enzymes, 
followed by assembly of the engineered 'half sites' into 
an active enzyme. Overall, our engineering effort achieved 
a 9 bp alteration of WT-AniI from its native sequence, and 
yielded an enzyme that exhibits significant specificity for the 
XID versus WT allele. 

The cluster-based engineering approach used in this 
study involved randomization of DNA interface residues, 
and the initial variants isolated from such libraries were 
noted to nearly uniformly have significantly lower cleavage 
activity than the WT enzyme. Based on these observations, 
we infer that simultaneous changes to multiple protein residues 
at the DNA interface region result in structural 
or conformational shifts that compromise enzyme activity. 
However, such changes have not been reliably predictable 
using current modeling techniques, making rational designs 
for compensation very difficult. In this study, we used a 
strategy of random mutagenesis following isolation of inter- 
face variants to identify those residues capable of improving 
enzyme activity. Within our final XID-Ani enzyme, 14 of 
31 mutations were introduced by random mutagenesis. Of 
these, we speculate that E86D, F91L, D122N and C150S 
may increase XID-Ani solubility and stability on the yeast surface. 
Notably, three of these residue alterations are naturally 
observed in a recently identified, highly active I-AniI homolog, 
the I-HjeI LHE (28). K39R, E63K, M66T, N226Y and 
R243W appear to be critical for enzyme activity, as they seem 
to have arisen to compensate for significant 
structural shifts caused by other mutations. The remaining 
five mutations (I55T, I64T, H76Q, R172K and K232E) 
may slightly affect enzyme activity through unclear mecha- 
nisms. Importantly, we were consistently able to apply ran- 
dom mutagenesis to identify variants with significantly in- 
creased cleavage activity, if the starting point was an enzyme 
with modest cleavage activity. However, we have not been 
successful in recovering active enzymes from randomly mu- 
tagenized non-active variants. We speculate that recovering 
activity from an inactive variant likely requires simultane- 
ous changes in multiple residues, and thus results in a com- 
binatorial problem whose complexity is beyond our present 
selection methods. 

Our work emphasizes the difficulty that protein engineers 
are likely to encounter when attempting to combine AA 
changes individually selected to allow for altered specificity 
in adjacent or nearby areas of the LHE protein/DNA inter- 
face. Our first attempt to directly combine designs that pro- 
duced single base-pair shifts in cleavage specificity failed to 
generate active variants, an observation which we attribute 
to neighbor effects between designs due to the highly integrated 
LHE protein/DNA interface. We also observed in 
several cases that active variants selected against a cluster 
(e.g. +6T+7G) were able to cut combined +6T and +7G 
but not +6T or +7G individually (Supplementary Figure 
S2D). This is consistent with structure-based predictions 
that these 2 bp are targeted by an overlapping and struc- 
turally dependent area of the protein/DNA interface, and 
thus separate designs would likely generate steric clashes 
that compromise enzyme activity. Similarly, our attempts to 
combine two adjacent regions targeted to -10A-8T and 
-6C-5C-4T by a direct combination strategy also 
failed. This observation could be explained if the structural 
shift caused by incorporation of the -6C-5C-4T 
design had a neighbor effect on the loop region targeted to 
-10A-8T. Only after that shift was compensated by mutations 
from another region of the interface was it possible to 
re-design this loop (Supplementary Figure S3G-I). Finally, 
the combination of XID (+) and (-) half designs, which reside 
in the I-AniI N-term and C-term domains, respectively, 
and target to distant sequences, unexpectedly did not gen- 
erate any active variants as a direct combination. However, 
following random mutagenesis, a single R243W mutation 
was recovered that allowed all inactive variants to regain 
cleavage activity. We hypothesize that the re-designed half 
enzyme resulted in a shift in the catalytic residues, which was 
subsequently compensated for by the R243W mutation. 

While we achieved significant success in resculpting a 
DNA/protein interface that allowed high specificity and ac- 
tivity toward the XID target, this interface was not able to 
achieve a high binding affinity for the XID site. Despite its 
reduced binding affinity in vitro, XID-Ani exhibited a level 
of activity in in vitro and in cellulo reporter assays that was 
equivalent to WT I-AniI. It also exhibited significantly improved 
specificity, critical for reducing the risk of off-target 
cleavage in therapeutic applications. We attribute the reduced 
binding affinity to the nature of the resculpted interface 
of the (-) half site, which disrupts several hydrogen 
bonds that are involved in generating the high affinity of 
WT I-AniI for its target site. We speculate that to achieve 
the same level of cleavage activity as WT I-AniI, the reduced 
binding affinity was partially compensated by an increased 
turnover rate (kcat). 

Importantly, the introduction of a site-specific TALE 
DNA binding domain using the megaTAL platform was 
able to overcome the reduced binding affinity of XID-Ani 
for the XID site, thereby providing markedly improved in 
vivo activity. Fusion of XID-Ani to TALE DNA bind- 
ing domains containing as few as six RVDs, resulted in 
markedly enhanced cleavage efficacy in both reporter cell 
lines and primary XID MEFs. The TALE DNA binding 
domain significantly increased XID-Ani cleavage efficacy 
to its target site but is not anticipated to alter the binding 
affinity of the XID-Ani cleavage head to its homologous 
sequence. Thus, we do not anticipate that the megaTAL 
fusion changes the intrinsic specificity of the HE cleav- 
age head. However, by increasing on-site activity, a desired 
level of on-site modification can be achieved with a reduced 
protein expression level and/or a more limited period of 
expression thereby resulting in reduced off site cleavage. 
Thus, megaTAL fusions result in an increase in the effective 
specificity of the enzyme (12). We speculate that future 
LHE engineering for therapeutic applications may focus on 
the intentional development of low affinity, high specificity 
LHEs, with the planned addition of a TALE or other affinity-enhancing 
domain to provide for site-specific activity at 
a desired single target site. 

Finally, a notable aspect of the performance of the XID- 
Ani megaTAL was that it was able to achieve a high ratio 
of HDR to NHEJ. High HDR to NHEJ ratios have typ- 
ically been observed with homing endonucleases, whereas 
both zinc finger nucleases (ZFNs) and TALENs have typ- 
ically yielded the reverse — rates of NHEJ that are higher 
than HDR (24). This is an important observation, as it sug- 
gests that the DNA ends created by megaTAL reagents are 
processed in a manner akin to their homing endonuclease 
cleavage heads, as opposed to being processed equivalently 
to a FokI-based TALEN. It also supports the concept that 
gene editing reagents based on homing endonuclease cleav- 
age domains may be superior choices for gene editing appli- 
cations that are dependent on homology-directed repair. 

In summary, here we have shown that a yeast sur- 
face display-based high throughput selection system for 
HE engineering can be applied in a progressive re-design 
strategy to execute an aggressive resculpting of an LHE 
protein/DNA interface. By adopting a progressive strategy 
that incorporated random mutagenesis to boost activity be- 
tween interface resculpting steps, we were able to achieve a 
9 bp alteration in cleavage specificity from the native target 
site. Furthermore, we show that compensating XID binding 
affinity through TALE DNA binding domains significantly 
improved the cleavage activity of the final enzyme, yield- 
ing a highly active and specific gene editing reagent able to 
catalyze high rates of homology-directed repair at its target 
site. The success of our progressive strategy in achieving a 
significant specificity shift and successful incorporation of 
the re-designed XID-Ani into the megaTAL format offer a 
roadmap for future LHE engineering projects aimed at cre- 
ating highly specific and active gene editing reagents. 